3574 results found.
Written
Corpus
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
15.3 MByte
Production Status:
Existing-used
Use:
Machine Learning
- Paper title: Par4Sim -- Adaptive Paraphrasing for Text Simplification
- Paper track: NLP engineering experiment paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country | Second Affiliation | Second Country |
|---|---|---|---|---|---|
| Author 1 | Seid Muhie Yimam | Universität Hamburg | DE | | |
| Author 2 | Chris Biemann | TU Darmstadt | DE | University of Hamburg | DE |
| Main Contact | Seid Muhie Yimam | Universität Hamburg | None | | |
Documentation:
https://www.inf.uni-hamburg.de/en/inst/ab/lt/resources/data/complex-word-identification-dataset.html
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
CreativeCommons
Size:
20 GByte
Production Status:
Newly created-in progress
Use:
Dialogue
- Paper title: Chats and Chunks: Annotation and Analysis of Multiparty Long Casual Conversations
- Paper track: Speech
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Emer Gilmartin | Trinity College Dublin | IE |
| Author 2 | Carl Vogel | Trinity College Dublin | IE |
| Author 3 | Nick Campbell | Trinity College Dublin | IE |
| Main Contact | Emer Gilmartin | Trinity College Dublin | None |
Documentation:
Annotation Manual
Written
Corpus
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Open For Reuse With Restrictions
Size:
808 KByte
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Using a Cross-Language Information Retrieval System based on OHSUMED to Evaluate the Moses and KantanMT Statistical Machine Translation Systems
- Paper track: Evaluation
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Nikolaos Katris | Department of CSIS, University of Limerick | IE |
| Author 2 | Richard Sutcliffe | School of CSEE, University of Essex | GB |
| Author 3 | Theodore Kalamboukis | Athens University of Economics and Business | GR |
| Main Contact | Nikolaos Katris | Department of CSIS, University of Limerick | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
24 GByte
Production Status:
Existing-used
Use:
Machine Learning
- Paper title: Embedding Words as Distributions with a Bayesian Skip-gram Model
- Paper track: NLP engineering experiment paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Arthur Bražinskas | University of Amsterdam, ILLC | N/A |
| Author 2 | Serhii Havrylov | University of Edinburgh, Institute for Language, Cognition and Computation | GB |
| Author 3 | Ivan Titov | University of Edinburgh / University of Amsterdam | GB |
| Main Contact | Serhii Havrylov | University of Edinburgh, Institute for Language, Cognition and Computation | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
<Not Specified>
Size:
12 <Not Specified>
Production Status:
Newly created-in progress
Use:
Information Extraction, Information Retrieval
- Paper title: TIMEN: An Open Temporal Expression Normalisation Resource
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Hector Llorens | <Not Specified> | None |
| Author 2 | Leon Derczynski | <Not Specified> | None |
| Author 3 | Robert Gaizauskas | <Not Specified> | None |
| Author 4 | Estela Saquete | <Not Specified> | None |
| Main Contact | Leon Derczynski | University of Sheffield | GB |
Documentation:
<Not Specified>
Language Type:
Trilingual
Languages:
Egyptian Arabic English Mandarin Chinese
Availability:
The Data Will Be Published Via LDC General Catalogue
License:
<Not Specified>
Size:
1936987 words
Production Status:
Newly created-finished
Use:
Anaphora, Coreference
- Paper title: Large Multi-lingual, Multi-level and Multi-genre Annotation Corpus
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Second Affiliation | Second Country |
|---|---|---|---|---|---|
| Author 1 | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | US | | |
| Author 2 | Martha Palmer | Department of Linguistics and Computer Science, University of Colorado | US | | |
| Author 3 | Nianwen Xue | Computer Science Department, Brandeis University | US | | |
| Author 4 | Lance Ramshaw | Raytheon BBN Technologies | US | | |
| Author 5 | Mohamed Maamouri | <Not Specified> | None | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 6 | Ann Bies | <Not Specified> | None | Linguistic Data Consortium, University of Pennsylvania | US |
| Author 7 | Kathryn Conger | Department of Linguistics and Computer Science, University of Colorado | US | | |
| Author 8 | Stephen Grimes | Linguistic Data Consortium, University of Pennsylvania | US | | |
| Author 9 | Stephanie Strassel | Linguistic Data Consortium, University of Pennsylvania | US | | |
| Main Contact | Xuansong Li | Linguistic Data Consortium, University of Pennsylvania | None | | |
Documentation:
<Not Specified>
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
FBK
Size:
40000 <Not Specified>
Production Status:
Newly created-in progress
Use:
Document Classification, Text categorisation
- Paper title: A Parallel Corpus of Music and Lyrics Annotated with Emotions
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country | Second Affiliation | Second Country |
|---|---|---|---|---|---|
| Author 1 | Carlo Strapparava | <Not Specified> | None | Fondazione Bruno Kessler, Trento | None |
| Author 2 | Rada Mihalcea | <Not Specified> | None | | |
| Author 3 | Alberto Battocchi | <Not Specified> | None | | |
| Main Contact | Carlo Strapparava | FBK-irst | IT | | |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English
Availability:
From Owner
License:
LGPL
Size:
<Not Specified>
Production Status:
Newly created-in progress
Use:
Text Mining
- Paper title: Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis.
- Paper track: Written
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Diana Maynard | University of Sheffield | GB |
| Author 2 | Mark Greenwood | University of Sheffield | GB |
| Main Contact | Diana Maynard | University of Sheffield | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
English Japanese
Availability:
From Owner
License:
<Not Specified>
Size:
30 GByte
Production Status:
Newly created-in progress
Use:
Language Modelling
- Paper title: Comparison of Pun Detection Methods Using Japanese Pun Corpus
- Paper track: Multimodality
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Motoki Yatsu | Aoyama Gakuin University | JP |
| Author 2 | Kenji Araki | Hokkaido University | JP |
| Main Contact | Motoki Yatsu | Aoyama Gakuin University | None |
Documentation:
In English, published as research papers
Speech
Typological Database
Language Type:
Multilingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution-ShareAlike 3.0 Unported License
Size:
2155 inventories Other
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Defining and Counting Phonological Classes in Cross-linguistic Segment Databases
- Paper track: Speech
- Paper status: Accept Poster+DemoSuggested
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Dan Dediu | Max Planck Institute for Psycholinguistics | NL |
| Author 2 | Scott Moisik | Max Planck Institute for Psycholinguistics | NL |
| Main Contact | Dan Dediu | Max Planck Institute for Psycholinguistics | None |
Documentation:
Extensive help




